FastForward for Concurrent Threaded Pipelines

Authors

John Giacomoni

Manish Vachharajani

Tipp Moseley

Abstract

The performance, cost, and flexibility of commodity multi-core systems make them appealing for threaded applications. Unfortunately, popular threading techniques require independent code regions and rely on expensive synchronization primitives and communication mechanisms. Recently, researchers have proposed several Concurrent Threaded Pipeline (CTP) architectures, which relax the data independence requirement and can increase computational throughput in proportion to the pipeline depth. Examples include Decoupled Software Pipelining, which focuses on compiler-based extraction of pipelines from sequential code, and the Frame Shared Memory architecture, which focuses specifically on network processing. CTP architectures show great promise for threading applications, provided a low-overhead, high-speed blocking queue implementation is available. This paper presents the FastForward system, a novel software-only, low-overhead, high-speed blocking queue implementation for CTPs. FastForward uses a novel domain-specific adaptation of concurrent lock-free (CLF) queues in conjunction with a clever memory organization to provide fast, low-overhead queue operations. The key to FastForward's success is its domain-specific optimization based on careful tuning for modern multi-core microarchitectures. Enqueue and dequeue times are as low as 35 ns, 5 times faster than the next best solution.
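The abstract does not describe the queue's internals, so the following is only a minimal sketch of one common way to build a lock-free single-producer/single-consumer queue, not the authors' implementation. It assumes that an empty slot is marked by a NULL pointer, so the producer and the consumer each keep a private index and never read the other's. The names `ff_queue`, `ff_init`, `ff_enqueue`, `ff_dequeue`, and `FF_QUEUE_SIZE` are hypothetical and chosen for illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define FF_QUEUE_SIZE 8 /* hypothetical capacity, chosen for illustration */

/* Hypothetical single-producer/single-consumer queue. A NULL slot means
 * "empty" (an assumption of this sketch), so the two threads share only
 * the slot array and never each other's index. */
typedef struct {
    _Atomic(void *) slots[FF_QUEUE_SIZE];
    size_t head; /* next slot to fill; touched only by the producer */
    size_t tail; /* next slot to drain; touched only by the consumer */
} ff_queue;

/* Mark every slot empty and reset both indices. */
static void ff_init(ff_queue *q) {
    assert(q != NULL);
    for (size_t i = 0; i < FF_QUEUE_SIZE; i++)
        atomic_init(&q->slots[i], NULL);
    q->head = 0;
    q->tail = 0;
}

/* Producer side. Returns 0 on success, -1 if the queue is full or data is
 * NULL (NULL is reserved as the empty marker). */
static int ff_enqueue(ff_queue *q, void *data) {
    if (data == NULL)
        return -1;
    /* A non-NULL slot at head means the consumer has not drained it yet. */
    if (atomic_load_explicit(&q->slots[q->head], memory_order_acquire) != NULL)
        return -1;
    atomic_store_explicit(&q->slots[q->head], data, memory_order_release);
    q->head = (q->head + 1) % FF_QUEUE_SIZE;
    return 0;
}

/* Consumer side. Returns 0 and writes the item to *out, or -1 if empty. */
static int ff_dequeue(ff_queue *q, void **out) {
    void *data = atomic_load_explicit(&q->slots[q->tail], memory_order_acquire);
    if (data == NULL)
        return -1;
    *out = data;
    /* Writing NULL hands the slot back to the producer. */
    atomic_store_explicit(&q->slots[q->tail], NULL, memory_order_release);
    q->tail = (q->tail + 1) % FF_QUEUE_SIZE;
    return 0;
}
```

Both operations return immediately instead of blocking; a pipeline stage that needs blocking behavior could retry in a loop until the call returns 0. The sketch makes no attempt at the cache-aware tuning the abstract mentions and has not been measured against the reported timings.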

Related resources

FastForward for Concurrent Threaded Pipelines ; CU-CS-1023-07

Assignment 11: Software Concurrent Threaded Pipelines

Traditionally, increases in transistors and fabrication technology have led to increased performance. However, these techniques are showing diminishing returns due to limitations arising from power consumption, design complexity, and wire delays. In response, designers have turned to chip multiprocessors (CMPs) that incorporate multiple cores on a single die. The performance, cost, and flexibil...

Harnessing Chip-Multiprocessors with Concurrent Threaded Pipelines ; CU-CS-1024-07

Single-core performance increases have stalled. To increase available cycles, microprocessor designers have shifted to chip-multiprocessor (CMP) designs. Unfortunately, the additional processors provided by CMPs may remain idle because most applications lack data-parallelism and task-parallelism is unlikely to saturate future CMP designs. The systems community needs to rethink how systems are st...

Harnessing Chip-Multiprocessors with Concurrent Threaded Pipelines

FastForward for Efficient Pipeline Parallelism ; CU-CS-1028-07

High-rate core-to-core communication is critical for efficient pipeline-parallel software architectures. This paper presents the FastForward system, a software-only low-overhead high-rate queue implementation for pipeline parallelism on multicore architectures. FastForward uses an architecturally-tuned domain-specific adaptation of concurrent lock-free queues to provide low-latency and lowoverhe...

Publication date: 2007
